An adaptive MEL-LPC analysis for speech recognition
Abstract
This paper describes a new speech analysis method, adaptive Mel-LPC (AMLPC) analysis, which exploits human auditory characteristics. The Mel-LPC analysis method that we previously proposed is an efficient time-domain technique for estimating the warped predictors directly from the input speech. However, the frequency resolution of the spectrum obtained by Mel-LPC analysis is constant, regardless of the characteristics of the input speech in each analysis frame. With AMLPC analysis, it is possible to estimate the spectral coefficients with a frequency resolution suited to the phoneme in each analysis frame, because the spectral slope and the formant structure differ between phoneme classes (vowels, fricatives and so on). The recognition performance of mel-cepstrum parameters obtained by AMLPC analysis was compared with that of mel-cepstrum parameters obtained by conventional LPC analysis and by Mel-LPC analysis, using gender-dependent phoneme and word recognition. The results show that the proposed method yields a significant improvement in recognition accuracy over conventional LPC analysis, and a slight improvement in error rate of about 10% over Mel-LPC analysis.
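To illustrate what "frequency resolution" means for a warped analysis, here is a minimal sketch. It assumes the warped frequency axis is produced by a first-order all-pass (bilinear) transform with a warping factor `alpha`; the abstract does not state this. The function names `warp_frequency` and `local_resolution` are hypothetical and are not taken from the paper. The sketch shows only the warping curve and its slope, not the AMLPC estimation procedure itself.

```python
import math


def warp_frequency(omega, alpha):
    """Warped frequency (radians) for a first-order all-pass warp.

    omega: normalized angular frequency in [0, pi].
    alpha: warping factor, |alpha| < 1 (assumed; alpha = 0 means no warping).
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def local_resolution(omega, alpha):
    """Slope of the warping curve at omega.

    Values above 1 mean that region of the spectrum is stretched on the
    warped axis (finer resolution); values below 1 mean it is compressed.
    """
    return (1.0 - alpha ** 2) / (1.0 + alpha ** 2 - 2.0 * alpha * math.cos(omega))


if __name__ == "__main__":
    # Compare how two illustrative warping factors distribute resolution.
    for alpha in (0.0, 0.2, 0.4):
        low = local_resolution(0.1 * math.pi, alpha)
        high = local_resolution(0.9 * math.pi, alpha)
        print(f"alpha={alpha:.1f}  slope near DC={low:.3f}  slope near Nyquist={high:.3f}")
```

Under this assumption, a fixed `alpha` gives every frame the same resolution profile, which is the limitation the abstract attributes to Mel-LPC analysis. An adaptive scheme would in effect choose the resolution per frame, but how AMLPC makes that choice is not described in the abstract.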
Related resources
An efficient mel-LPC analysis method for speech recognition
This paper proposes a simple and efficient time-domain technique to estimate an all-pole model on a mel-frequency axis (Mel-LPC). This method requires only twice the computational cost of conventional linear prediction analysis. The recognition performance of mel-cepstral parameters obtained by the Mel-LPC analysis is compared with those of conventional LP mel-cepstra and the melfreque...
Evaluation of mel-LPC cepstrum in a large vocabulary continuous speech recognition
This paper presents a simple and efficient time-domain technique to estimate an all-pole model on the mel-frequency scale (Mel-LPC), and compares the recognition performance of the Mel-LPC cepstrum with those of both the standard LPC mel-cepstrum and the MFCC using the Japanese dictation system (Julius) with a 20,000-word vocabulary. First, the optimal value of frequency warping factor is examined in ...
An investigation of cepstral parameterisations for large vocabulary speech recognition
We examined variants of MFCC and PLP cepstral parameterisations in the context of large vocabulary continuous speech recognition under different acoustical environmental conditions. Compared to MFCC, mel-frequency PLP uses a cubic root intensity-to-loudness law, and an LPC analysis is applied to the mel-warped spectrum. In LPC-smoothed MFCC, the only difference to MFCC is the additional LPC smooth...
Performance Evaluation of Blind Equalization for Mel-LPC based Speech Recognition under Different Noisy Conditions
This study aims to develop a noise-robust distributed speech recognizer (DSR) for real-world applications by employing Blind Equalization (BEQ) for robust feature extraction. The main focus of the work is to cope with different noisy environments in the recognition phase. To realize this objective, Mel-LP based speech analysis has been used in speech coding on the linear frequency scale by appl...
Experiments on a parametric nonlinear spectral warping for an HMM-based speech recognizer
This paper is concerned with the search for an optimal feature-set for a speech recognition system. A better acoustic feature analysis that suitably enhances the semantic information in a consistent fashion can reduce raw-score (no grammar) error rate significantly. A simple two-dimensional parameterized feature set is proposed. The feature-set is compared against a standard mel-cepstrum, LPC-b...
Publication date: 2004